Not Change-point analysis
Measure change effect
Signal vs noise
Want generic technique
Binary outcome (0 or 1)
Monthly summaries
Sales due to faster turnaround
\[
P(A | B) = \frac{P(B|A) P(A)}{P(B)}
\]
\[
p(\theta | D) \propto p(D | \theta) \, p(\theta)
\]
where \(D\) is the observed data and \(\theta\) the model parameter
Single trial:
\[ p(y|\theta) = \theta^y (1 - \theta)^{1-y} \]
\(n\) trials, \(k\) successes:
\[ p(k | \theta) = \binom{n}{k} \, \theta^k (1 - \theta)^{n-k} \]
\[
p(\theta) = \text{Beta}(\alpha, \beta)
\]
\[
p(\theta | D) = \text{Beta}(\alpha + k, \beta + n - k)
\]
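The conjugate update above is pure parameter arithmetic; a minimal sketch in R (illustrative numbers, not the workshop data):

```r
# Beta-Binomial conjugate update: add successes to alpha, failures to beta
beta_update <- function(alpha, beta, k, n) {
  c(alpha = alpha + k, beta = beta + n - k)
}

# Prior Beta(10, 90) (mean 0.10), then observe 45 conversions in 500 calls
beta_update(10, 90, k = 45, n = 500)
```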
Generate data
Calculate yearly posterior distributions
Graph it
Had 500 calls per month
Treat as Poisson process
\[
C \sim \text{Pois}(500)
\]
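Treating call volume as a Poisson process, simulated monthly data might be generated like this (a sketch; the 500 calls/month rate is from the slide above, the 10% conversion rate is an assumed illustrative value):

```r
set.seed(42)  # reproducible simulation

# 24 months of call counts at an underlying rate of 500 per month
monthly_calls <- rpois(24, lambda = 500)

# conversions per month as binomial draws at an assumed 10% rate
monthly_convs <- rbinom(24, size = monthly_calls, prob = 0.10)
```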
Add noise to the underlying rate?
Aggregate yearly
Look at yearly conversions
Why the discrepancy in outputs?
Quantity of data accumulates
Prior very strong
Need to rethink priors
Prior represents knowledge
How confident are we?
Balancing act
Estimate \(\theta\), assign a strength
Reparameterise \(\text{Beta}(\alpha, \beta)\)
\[
\text{Beta}(\alpha, \beta) \rightarrow \text{Beta}(\mu K, (1 - \mu) K)
\]
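In this parameterisation \(\mu\) is the prior mean and \(K\) the prior strength in pseudo-observations; converting back to shape parameters is one line of R (a sketch):

```r
# Convert (mu, K) to the usual Beta shape parameters
beta_from_muK <- function(mu, K) {
  c(alpha = mu * K, beta = (1 - mu) * K)
}

beta_from_muK(mu = 0.10, K = 6000)  # approximately Beta(600, 5400)
```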
stoc_count_tbl %>%
  filter(rate_date < as.Date('2016-01-01')) %>%
  summarise(conv_count = sum(conversion_count),
            call_count = sum(call_count),
            rate       = conv_count / call_count)

## # A tibble: 1 x 3
##   conv_count call_count      rate
##        <int>      <int>     <dbl>
## 1       3477      35934 0.0967607
Assume 1 year of ‘strength’
\(K = 12 \times 500 = 6,000\)
\[
\mathcal{N}(0.10, 0.02) \rightarrow \mathcal{N}(0.15, 0.02)
\]
Mean-shift well outside variance
\[
\mathcal{N}(0.40, 0.08) \rightarrow \mathcal{N}(0.45, 0.08)
\]
Can we see difference?
Very hard to spot a change!
How can we quantify differences?
A metric or distance, e.g. the common area (overlap coefficient):
\[
D(P, Q) = \int^1_0 \min(p(x), q(x)) \, dx
\]
The Kullback–Leibler divergence:
\[
D_{KL}(P \| Q) = \int^1_0 p(x) \ln \frac{p(x)}{q(x)} \, dx
\]
Not symmetric
No triangle inequality
Intuitive information theory interpretation
The squared Hellinger distance:
\[
H^2(P, Q) = 1 - \int \sqrt{p(x) \, q(x)} \, dx
\]
\[ 0 \leq H(P, Q) \leq 1 \]
\[ H^2(P, Q) \leq \delta(P, Q) \leq \sqrt{2} H(P, Q) \]
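A possible implementation of the `calculate_metrics` helper used below (an assumed sketch; the original's internals are not shown, and judging by its outputs the `commonarea` column reports one minus the overlap, so identical distributions score zero):

```r
# Assumed sketch of a calculate_metrics-style helper.
# Simple Riemann sums over densities evaluated on an x grid.
calculate_metrics <- function(x_seq, p_dens, q_dens) {
  dx <- x_seq[2] - x_seq[1]

  # 1 minus the common area, so identical distributions score ~0
  commonarea <- 1 - sum(pmin(p_dens, q_dens)) * dx

  # Squared Hellinger distance
  hellinger <- 1 - sum(sqrt(p_dens * q_dens)) * dx

  # KL divergence, skipping grid points where either density is zero
  ok <- p_dens > 0 & q_dens > 0
  kl <- sum(p_dens[ok] * log(p_dens[ok] / q_dens[ok])) * dx

  c(commonarea = commonarea, hellinger = hellinger, kl = kl)
}

# Identical distributions: all three metrics are numerically ~0
x_seq <- seq(0, 1, by = 1e-4)
metrics_same <- calculate_metrics(x_seq,
                                  dbeta(x_seq, 600, 5400),
                                  dbeta(x_seq, 600, 5400))
```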
\[
\mu = 0.10 \;\; K_1 = 6,000 \;\; K_2 = 7,000 \;\; K_3 = 12,000
\]
calculate_metrics(x_seq, Beta1, Beta1) %>% print(digits = 2)

## commonarea  hellinger         kl
##    4.4e-16    4.4e-16    0.0e+00

calculate_metrics(x_seq, Beta1, Beta2) %>% print(digits = 2)

## commonarea  hellinger         kl
##     0.0373     0.0015     0.0063

calculate_metrics(x_seq, Beta1, Beta3) %>% print(digits = 2)

## commonarea  hellinger         kl
##      0.166      0.029      0.153
Fix \(\mu_1\)
Have 1 year of data as prior, \(K_1 = 6,000\)
Set new \(\mu_2\)
Check distribution:
Two months, \(K_2 = 7,000\); one year, \(K_3 = 12,000\)
calculate_metrics(x_seq, Beta1, Beta1) %>% print(digits = 4)

## commonarea  hellinger         kl
##  4.441e-16  4.441e-16  0.000e+00

calculate_metrics(x_seq, Beta1, Beta2) %>% print(digits = 4)

## commonarea  hellinger         kl
##    0.15522    0.01974    0.08640

calculate_metrics(x_seq, Beta1, Beta3) %>% print(digits = 4)

## commonarea  hellinger         kl
##     0.5592     0.2624     1.8189

calculate_metrics(x_seq, Beta1, Beta1) %>% print(digits = 4)

## commonarea  hellinger         kl
##  4.441e-16  4.441e-16  0.000e+00

calculate_metrics(x_seq, Beta1, Beta2) %>% print(digits = 4)

## commonarea  hellinger         kl
##     0.6553     0.3602     1.9564

calculate_metrics(x_seq, Beta1, Beta3) %>% print(digits = 4)

## commonarea  hellinger         kl
##     0.9997     0.9981    39.1994
What about our data?
We have monthly call data
Have posterior distributions
Calculate metrics as data updates
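The month-by-month monitoring loop could look like the following sketch, using hypothetical simulated data in place of the workshop's monthly call table, and the squared Hellinger distance of each updated posterior from the fixed prior:

```r
set.seed(101)

# Hypothetical monthly data at an assumed pre-change rate of 10%
calls <- rpois(12, lambda = 500)
convs <- rbinom(12, size = calls, prob = 0.10)

# Prior: mu = 0.10, K = 6000  ->  Beta(600, 5400)
a  <- 600; b  <- 5400
a0 <- a;   b0 <- b   # frozen reference distribution (the prior)

x_seq <- seq(0, 1, by = 1e-4)
dx    <- x_seq[2] - x_seq[1]

# Squared Hellinger distance of each month's posterior from the prior
h2 <- numeric(length(calls))
for (m in seq_along(calls)) {
  a <- a + convs[m]
  b <- b + calls[m] - convs[m]
  h2[m] <- 1 - sum(sqrt(dbeta(x_seq, a, b) *
                        dbeta(x_seq, a0, b0))) * dx
}
```

With no change in the underlying rate the distance stays small; a genuine rate shift should push it steadily towards 1.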
Binomial process with known change point
Model with Beta distribution
Aggregate data appropriately
Distribution plots and f-divergence metrics
Decide on thresholds
Try with other processes / distributions
More comprehensive behaviour investigation
Look at statistical distance
Time-series methods
https://github.com/kaybenleroll/dublin_r_workshops
Blog post:
http://blog.applied.ai/a-bayesian-approach-to-monitoring-process-change/